Prologue: A machine learning sampler

Author

  • Peter Flach
Abstract

You may not be aware of it, but chances are that you are already a regular user of machine learning technology. Most current e-mail clients incorporate algorithms to identify and filter out spam e-mail, also known as junk e-mail or unsolicited bulk e-mail. Early spam filters relied on hand-coded pattern-matching techniques such as regular expressions, but it soon became apparent that this approach is hard to maintain and offers insufficient flexibility: after all, one person’s spam is another person’s ham! Machine learning techniques provide the additional adaptivity and flexibility. SpamAssassin is a widely used open-source spam filter. It calculates a score for an incoming e-mail, based on a number of built-in rules (or ‘tests’ in SpamAssassin’s terminology), and adds a ‘junk’ flag and a summary report to the e-mail’s headers if the score is 5 or more. Here is an example report for an e-mail I received:
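
The rule-based scoring described above can be sketched in a few lines of Python. This is not SpamAssassin's code or rule set, and it is not the report mentioned in the abstract: the rule names, patterns and weights below are invented for illustration, and only the threshold of 5 comes from the text.

```python
import re
from typing import List, Tuple

# Hypothetical rules (name, regex, weight). These are illustrative only
# and do not correspond to real SpamAssassin tests or scores.
RULES: List[Tuple[str, str, float]] = [
    ("HAS_FREE_OFFER", r"\bfree\b", 2.0),
    ("MANY_EXCLAMATIONS", r"!{3,}", 1.5),
    ("MENTIONS_LOTTERY", r"\blottery\b", 2.5),
]

# The text states that an e-mail is flagged if its score is 5 or more.
THRESHOLD = 5.0


def score_email(body: str) -> Tuple[float, List[Tuple[str, float]]]:
    """Return the total score and the list of (rule name, weight) that fired."""
    fired = [
        (name, weight)
        for name, pattern, weight in RULES
        if re.search(pattern, body, re.IGNORECASE)
    ]
    total = sum(weight for _, weight in fired)
    return total, fired


def is_junk(body: str) -> bool:
    """Flag the e-mail as junk when its score reaches the threshold."""
    total, _ = score_email(body)
    return total >= THRESHOLD
```

With fixed weights like these, the filter is only as good as the hand-chosen numbers, which is the maintenance problem the abstract points to; learning the weights from labelled e-mail is where machine learning would come in.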

Similar resources

Introduction to Machine Learning & Case-Based Reasoning

Prologue: These notes refer to the course of Machine Learning (course 395), Computing Department, Imperial College London. The goal of this syllabus is to summarize the basics of machine learning and to provide a detailed explanation of case-based reasoning. This chapter introduces the term "machine learning" and defines what we mean when using this term. This is a very short summary of th...

Effective Function Recovery for COTS Binaries using Interface Verification

Function recovery is a critical step in many binary analysis and instrumentation tasks. Existing approaches rely on commonly used function prologue patterns to recognize function starts, and possibly epilogues for the ends. However, this approach is not robust when dealing with different compilers, compiler versions, and compilation switches. Although machine learning techniques have been propo...

A Gibbs Sampler for Learning DAGs

We propose a Gibbs sampler for structure learning in directed acyclic graph (DAG) models. The standard Markov chain Monte Carlo algorithms used for learning DAGs are random-walk Metropolis-Hastings samplers. These samplers are guaranteed to converge asymptotically but often mix slowly when exploring the large graph spaces that arise in structure learning. In each step, the sampler we propose dr...

Learning When to Reject an Importance Sample

When observations are incomplete or data are missing, approximate inference methods based on importance sampling are often used. Unfortunately, when the target and proposal distributions are dissimilar, the sampling procedure leads to biased estimates or requires a prohibitive number of samples. Our method approximates a multivariate target distribution by sampling from an existing, sequential ...

Publication date: 2015